Clustering Student Learning Activity Data

Author

  • Haiyun Bian
Abstract

We show a variety of ways to cluster student activity datasets using different clustering and subspace clustering algorithms. Our results suggest that each algorithm has its own strengths and weaknesses and can be used to find clusters with different properties.

1 Background Introduction

Many educational datasets are high dimensional by nature, which makes it difficult to find coherent and compact clusters. Subspace clustering was proposed as a solution to this problem [1]. It searches for compact clusters embedded within subsets of features, and it has proven effective in domains whose high dimensional datasets resemble educational data. In this paper, we show that different clustering and subspace clustering algorithms produce clusters with different properties, and that all of these clusters help instructors assess their course outcomes from various perspectives.

2 Clustering Student Activity Data

We assume that datasets are in the following format: each row represents one student record, and each column measures one activity that students participate in. Our test data contain 30 students and 16 activities; 7 of the students failed the class. The final grade is the weighted average of the scores in all 16 activities.

2.1 Student clusters

Student clusters consist of groups of students who demonstrate similar learning curves throughout the whole course. These clusters help identify the key activities that differentiate successful students from those who fail the course. We applied SimpleKMeans from Weka [2] to the test dataset with k set to 2. The results show that cluster1 contains 6 of the 7 students who failed the course, and cluster2 contains 24 students, of whom 23 passed. The one failing student assigned to cluster2 has a composite final score of 58%, which lies right on the pass/fail boundary.
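
The row-per-student layout, the weighted final grade, and the k = 2 student clustering described above can be illustrated with a minimal sketch. It is not Weka's SimpleKMeans and does not use the authors' data: the 6 x 4 score matrix, the equal weights, the pass mark of 60, and the helper name `kmeans` are all assumptions made for illustration, and the centroids are seeded with the first k rows only to keep the run deterministic.

```python
# Hypothetical scores: one row per student, one column per activity.
# (The paper's test set is 30 students x 16 activities; this is made-up data.)
scores = [
    [90, 85, 88, 92],
    [35, 40, 30, 45],
    [80, 78, 85, 75],
    [45, 50, 38, 42],
    [95, 90, 93, 89],
    [60, 55, 62, 58],
]
weights = [0.25, 0.25, 0.25, 0.25]  # assumed equal activity weights
PASS_MARK = 60                      # assumed threshold, not stated in the paper

# Final grade = weighted average of the activity scores.
final_grades = [sum(w * s for w, s in zip(weights, row)) for row in scores]
passed = [g >= PASS_MARK for g in final_grades]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    # Deterministic seeding (an assumption for reproducibility): first k rows.
    centroids = [list(map(float, r)) for r in rows[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each student joins the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


labels, centroids = kmeans(scores, k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
print(passed)  # [True, False, True, False, True, False]
```

Comparing `labels` with `passed` is the same check the section reports for the real data: how closely the two clusters line up with the pass/fail outcome. In this toy data they coincide exactly, which need not hold for real course records.
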
2.2 Activity clusters

Here we focus on finding groups of activities in which all students demonstrate similar performance. For example, we may find a group of activities where all students show

Similar resources

Clustering Students for Group-Based Learning in Foreign Language Learning

Big data make it possible to mine learning information for insights regarding student performance in foreign language learning (FLL). Group-based learning is a usual method to improve FLL, whose effectiveness is greatly influenced by student groups. The general grouping method is to divide students into groups by their teacher manually, which is not timely or accurate. To overcome the shortcomi...

A Preliminary Study on Clustering Student Learning Data

Clustering techniques have been used on educational data to find groups of students who demonstrate similar learning patterns. Many educational data are relatively small in the sense that they contain less than a thousand student records. At the same time, each student may participate in dozens of activities, and this means that these datasets are high dimensional. Finding meaningful clusters f...

High-Dimensional Unsupervised Active Learning Method

In this work, a hierarchical ensemble of projected clustering algorithm for high-dimensional data is proposed. The basic concept of the algorithm is based on the active learning method (ALM) which is a fuzzy learning scheme, inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...

Cluster analysis of student activity in a web-based intelligent tutoring system

In this paper we present a model of a system for integration of an intelligent tutoring system with data mining tools. The purpose of the integration is twofold; a) to power the system adaptability based on clustering and sequential pattern mining, and b) to enable teachers (non-experts in data mining) to use data mining techniques in their web browser on a daily basis, and get useful visualiza...

Mining Student data by Ensemble Classification and Clustering for Profiling and Prediction of Student Academic Performance

Applying Data Mining (DM) in education is an emerging interdisciplinary research field also known as Educational Data Mining (EDM). Ensemble techniques have been successfully applied in the context of supervised learning to increase the accuracy and stability of prediction. In this paper, we present a hybrid procedure based on ensemble classification and clustering that enables academicians to ...

Publication date: 2010